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Abstract 

We present a term rewriting procedure based on congruence closure 
that can be used with arbitrary equational theories. This procedure is 
motivated by the pragmatic need to prove equations in equational theo- 
ries where confluence can not be achieved. The procedure uses context 
free grammars to represent equivalence classes of terms. The proce- 
dure rewrites grammars rather than terms and uses congruence closure 
to maintain certain congruence properties of the grammar. Grammars 
provide concise representations of large term sets. Infinite term sets 
can be represented with finite grammars and exponentially large term 
sets can be represented with linear sized grammars. Although the pro- 
cedure is primarily intended for use in nonconfluent theories, it also 
provides a new kind of confluence that can be used to give canoni- 
cal rewriting systems for theories that are difficult to handle in other 
ways. For example, under grammar rewriting there is a finite canoni- 
cal rewrite system for idempotent semigroups, a theory which has been 
shown not to have any finite canonical system under traditional notions 
of rewriting. 
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1 Introduction 

In most practical applications of term rewriting systems, such as verifications done 
with the Boyer-Moore prover [Boyer and Moore, 1979], confluence can not be 
achieved. In such cases attempts to prove true equations often fail. In this paper 
a new term rewriting procedure is described that is intended to improve the suc- 
cess rate for proof attempts in nonconfluent theories. In cases where normal term 
rewriting fails, success can sometimes be achieved by generating a set of normal 
forms rather than an individual normal form. This is done using a term ordering 
where a single term can have many normal forms all of which are minimal under 
that ordering. Somewhat surprisingly, canonicalization under such weak term or- 
derings is possible. The canonical form of a given term is taken to be the set of all 
normal forms. 

The main problem with using sets as canonical forms is that the sets involved 
can be quite large. In fact, to improve the success rate in nonconfluent theories 
we would like the canonical sets to be as large as possible. Fortunately, large 
sets can be compactly represented with context free grammars. A finite grammar 
can represent an infinite set of terms. More importantly, very large finite term 
sets can be represented by compact grammars. Consider the equational theory 
consisting of the single commutativity axiom of the form x + y = y + x. In this 
theory the equivalence class of a sum of n constants contains order 2 n terms. 
However, a grammar for generating this class contains only order n productions. 
As another example consider the theory consisting of a single associativity axiom 
x + (y + z) = (x + y) + z. In this theory the size of the equivalence class of a sum of 
n constants is the Catalan number of n — a hyperexponential function. However 
the equivalence class of a sum of n constants can be generated by a grammar with 
order n 3 productions. In the equational theory that contains both associativity 
and commutativity the grammar for generating the equivalence class of a sum of 
n constants grows exponentially in n. However, the grammar is still vastly smaller 
than the equivalence class itself. 

Using grammars to represent equivalence classes involves the well known con- 
gruence closure procedure. Congruence closure is an efficient algorithm for de- 
termining the consequences of a finite set of ground equations [Kozen, 1977] 
[Shostack, 1978], [Nelson and Oppen, 1980], [Downey et a/., 1980]. A finite set 
of ground equations can be converted to a grammar that encodes the congruence 
relation on terms implicit in the equations. Congruence closure can be viewed as 
an algorithm for converting ground equations to grammars. The rewriting proce- 
dure described here operates on grammars — a grammar is repeatedly rewritten 
to incorporate new ground equations. 

The rewriting procedure described here is analogous to ordered rewriting sys- 



terns [Bachmair et a/., 1987], [Hsaing and Rusinowitch, 1987], [Martin and Nip- 
kow, 1990], [Peterson, 1990]. It rewrites (representations of equivalence classes 
of) ground terms using a set of unordered equations. This rewriting is done in 
the presence of a well founded order on ground terms and the rewriting process 
is guaranteed to terminate. However, unlike previous ordered rewriting systems, 
the order on ground terms is not assumed to be total. The commutativity equa- 
tion x + y = y + a:canbe handled by placing both sides of the equation into the 
representation of the set being rewritten. Thus the grammar rewriting procedure 
described here provides a way of handling non-orientable equations that is differ- 
ent from both ordered rewriting and from the use of special unification algorithms 
[Jouannaud and Kirchner, 1986]. 

The grammar rewriting procedure described here has been incorporated into 
the Ontic verification system which is under continued development by the au- 
thor, Robert Givan, Carl Witty, and Kevin Zalondek. Experimentation with the 
procedure is currently under way. 



2 Congruence Grammars 

Congruence grammars provide a way of compactly representing equivalence classes 
of first order terms. Each equivalence class is represented by a nonterminal symbol 
of the grammar which generates the elements of the class. Equivalence classes are 
always disjoint sets. This observation motivates the restriction that no two produc- 
tions of a congruence grammar can have the same right hand side. For example, 
we can not have X -> a and Y -> a where X and Y are distinct nonterminal 
symbols. 

Definition: A Congruence Grammar is a set of productions of the 
form X -> /(Yi, . . . , Y n ), where / is an n-ary function symbol and 
each Yi is a nonterminal symbol, and where no two productions have 
the same right hand side. 

We use the standard definition of a first order term where we assume an infinite 
set of function symbols of each arity (number of arguments). Constant symbols 
are treated as function symbols of no arguments. So, for example, the production 
X -» a is a production of the above form where a is function of no arguments 
The above definition allows us to prove that distinct nonterminal symbols generate 
disjoint classes. 

Lemma: A given term is generated by at most one nonterminal of a 
congruence grammar. 



Proof: The proof is by induction on the size of the term. No constant 
(zero ary function) can be generated by more than one nonterminal 
symbol because this would imply the existence of two productions with 
the same right hand side. Now consider a term t such that all terms 
smaller than t are generated by at most one nonterminal symbol. Sup- 
pose that t is generated by two nonterminal symbols X and Y using 
the productions X -* f(Z u - • . Z n ) and Y -+ f{W u • • , W n ) respec- 
tively. Since no term smaller than t can be generated by more than one 
nonterminal symbol we must have that Z, equals W{. But this violates 
the assumption that no two productions have the same right hand side. 



Example, Representing an Infinite Class: Consider the equivalence class of 
the constant symbol a under the single equation a = f(a). This infinite equivalence 
class is a context free language generated by the following two productions. 

X -> f(X) 

X — * a 

Example, An Equivalence Class under Commutativity: Consider n distinct 
constants a u a 2 , • • •, a n and define the term U to be the sum of the first i constants 
associated to the left, i.e., t t is a u t 2 is <n + a 2 , t 3 is ( fll +a 2 ) + a 3 , and so on. Note 
that for 1 < i < n we have that U is t^ + a,. Suppose that + is commutative 
but not associative. In this case the equivalence class of the term t n is generated 
by the nonterminal X n in the grammar containing the 2(n - 1) productions of the 
form 

Xi -► (fli + X,-i) 

X% -> (^t-i + a,-) 
together with the production 

Xi —> CL\. 

The 2"- 1 terms in this equivalence class are generated by a grammar with 2n - 1 
productions. 

Example, An AC Equivalence Class: Consider n constants a u a 2 , •••, a n 
and suppose that + is both associative and commutative. Let S be any non-empty 
subset of these constants. We can write the set S as {a,-,, a t - 2 , •-., a ik } where 
j < h implies i 5 < i h . We define the term t s to be the term (• • • ((a tl +a t - 2 )+- • • a tfc ). 
Let U be the set of all n constants. The equivalence class of the'term t v canbe 
generated by a grammar with nonterminal symbols of the form X s where S is a 



nonempty subset of the constant symbols. This grammar contains all productions 
of the form 

Xs — *■ (Xw + Xy) 

where W and V are disjoint subsets of U and S is W U V. The grammar also 
contains all productions of the form 



X 



{*> 



The equivalence class of t v is generated by the nonterminal X v of this grammar. 
More generally, the nonterminal X s generates the equivalence class of t s . 

The grammar that generates the equivalence class of t v grows exponentially in 
size of U — the grammar contains order 3 n productions. This fact can be derived 
by observing that each production 

Xs — ► (X w + Xy) 

classifies each element of U in one of three ways — either as a member of W, or 
a member of V, or as a member of neither. There are 3 n ways of classifying n 
constants into three groups. There are actually less than 3 n productions because 
the sets W and V in the above production must both be nonempty. The number 
of productions in this grammar should be contrasted with the number of terms in 
the equivalence class. For n = 8 the grammar contains 6,058 productions while 
the equivalence class contains over 17 million terms. 

With no term ordering the grammar rewriting procedure described in section 4 
requires time on the order of 4 n to construct the grammar that generates the 
equivalence class of t v under the equational theory that contains the associativity 
and commutativity axioms for +. It seems that this may be acceptable in a large 
number of rewriting applications. In the theorems of the Boyer-Moore corpus, for 
example, the number of elements of an AC term rarely exceeds three [Boyer and 
Moore, 1979]. For n = 5 we have that 4 n is about a thousand, a small number 
for modern computers. When a nontrivial term ordering is imposed the grammar 
rewriting process generates a smaller grammar and terminates more quickly. 

3 Congruence Grammars and Ground Equations 

A congruence grammar is a representation of an equivalence relation on terms. 
It turns out that those relations which can be represented by finite congruence 
grammars are exactly those relations which are the deductive closure of a finite 
set of ground equations. This section states some basic results relating congruence 
grammars and ground equations. 



First, we define the equivalence relation on (all) ground terms represented 
by a congruence grammar. This is done by defining an interning operation on 
ground terms. The word "interning" is commonly used to describe the way strings 
are mapped to variables in most programming languages. Here we are mapping 
semantically equivalent terms to the same internal data structure. We assume 
a given congruence grammar and define an interning operation such that for any 
term t, the result of interning *, denoted I[t], is a nonterminal of the grammar such 
that I[t] generates t If there is no nonterminal of the grammar which generates t 
then the grammar is extended with new productions in such a way that the desired 
nonterminal is created. 

Procedure for Interning a term t: 

• Let s u • ••, s n be the (possibly empty) sequence of immediate 
subterms of t, i.e., the terms such that t is /(a 1? • • . , s n ). 

• Let Yi, •-., K n be the result of recursively interning the terms 

*1» ••', 3 n . 

• Let X — ► /(Yi, • • • , Y n )be the production whose right hand side 
is /(yi? • * • , Y n ). If there is no such production then create one 
with a new nonterminal X and add it to the grammar. 

• Return the nonterminal symbol X. 

Since no two productions share the same right hand side the productions can 
be stored in a hash table indexed by their right hand sides. Throughout this 
section we assume the presence of such a hash table and assume that hash table 
operations can be performed in unit time. Under these assumptions the above 
intern procedure runs in linear time in the size of the given term. 

For any fixed term t and any sequence of interning operations the nonterminal 
symbol I[t] that results from interning t will be the same for each repeated interning 
of t. This allows one to think of the interning operation as a well defined function 
from terms to nonterminal symbols determined by the initial congruence grammar. 
In fact, the initial congruence grammar determines an equivalence relation on terms 
such that two terms s and t are considered to be equivalent if I[s] equals /[*]. For 
example, suppose the initial grammar contains the two productions X -* f(X) 
and X -> a and let f n (a) be an abbreviation for the term /(/(• • -/(a))) with n 
occurrences of /. In this case we have that I[f n (a)] equals X for any term of 
the form /»(«). This implies that /[(?(/»)] will be equal to I[g(f m (a))] for any 
nonnegative integers n and m. Computing Jfo(/»(a))] gives a nonterminal symbol 
Y such that the grammar has been extended to include the production Y -> g(X) 
Although interning can add productions to the grammar, it does not change the 



equivalence relation on terms denned by the grammar. Different grammars define 
different intern mappings and we will write I G [s] to mean the result of interning 
the term s with respect to the grammar G. 



Theorem: For any finite set H of equations between ground terms 
there exists a finite congruence grammar G r (H) such that that for any 
two ground terms s and t we have that H \= s = f if and only if lG r (H)[s] 
equals lG r (H)[t]. Furthermore, assuming that hash table operations can 
be done in unit time, the grammar G r (H) can be computed from H in 
n log n time where n is the written length of H. 

The proof of this theorem is based on well known algorithms for congruence closure 
[Kozen, 1977], [Shostack, 1978], [Nelson and Oppen, 1980], [Downey et a/., 1980]. A 
procedure for incrementally incorporating new equations directly into a congruence 
grammar is given in section 7. A nonoptimal algorithm for converting a set of 
equations H to a congruence grammar G r (H) can be described as follows. For 
each term s occuring in H we introduce a nonterminal symbol X,. 1 For each 
constant symbol a appearing in H we construct the production X a -+ a. For 
each term f(w u ... w k ) occuring in H we add the production X s{wu ... Wk) -► 
f(Xwi, • • • , X Wn ). Now we process the equations in H while maintaining a union- 
find structure on nonterminal symbols. For each equation s = w in H we call 
union on the nonterminals X, and X w . After processing all equations we select a 
canonical representative from each equivalence class of nonterminal symbols and 
replace every nonterminal in the grammar by its canonical representative. The 
resulting grammar need not be a congruence grammar because it is now possible 
that two distinct productions have the same right hand side. Any time we have 
two productions X - f{W u ... W n ) and Y - f(W u ... W n ) with the same 
right hand side we uniformly replace all occurances of X in the grammar by Y. 
Such replacement is continued until we have a congruence grammar. 

The transformation from equations to grammars has an inverse — one can 
transform a grammar back into a set of equations. For each nonterminal symbol 
X we construct a new constant symbol c x . The constants of the form c x will be 
called internal constants to distinguish them from the other (external) constants. 
A term that does not contain any of these internal constants will be called an 
external term. We now have the following definition and lemma. 

Definition: If G is a congruence grammar we define E q (G) to be a set 
of equations of the form c x = /(<*•„ . . . , cy n ) where G contains the 
production X -> f(Y u • • • , Y n ). 



HWe say that 8 occurs in H if s is either one side of an equation in H or 8 is a subterm of a 



term in an equation in H 



For example, if G contains the two productions X -+ f(X) and X -* a, then 
E q (G) contains the two equations ex = f(cx) and c x = a. 

Lemma: For any congruence grammar G, and any two external ground 
terms s and *, we have that I G [s] equals I G [t] if and only if E q (G) \= 
s = t. 

Proof Sketch: One can show by induction on the size of a term s that 
E q {G) \=s = c /oW . Then if I G [s] = I G [t] = X we have £ g (G) f= s = c x 
and £ 9 (G) |= t = c* so £ 9 (<3) |= 3 = *. To show the converse we add 
a production X -► c x for each nonterminal X. These productions 
do not alter the intern function on external terms. We then consider 
the congruence relation (on both internal and external terms) defined 
by the intern function for this extended grammar. This congruence 
relation provides a semantic model of EJG). So if I G \s] ^ I G \t] then 
E q (G) ¥=s = t.m 

The operation E q introduces internal constants into the equation set. These in- 
ternal constants are irrelevant to the equivalence relation on external terms denned 
by the equation set. One can define the operation G r so that it eliminates inter- 
nal constants from the grammar. The equation sets constructed by E q are useful 
for analysis and conceptual definitions but they are never actually computed. All 
computation is done directly on grammars. 

4 Grammar Rewriting 

This section defines the basic concepts of grammar rewriting. 

Definition: A grammar rewriting system is a pair <E, w> where 
E is a set of equations between first order terms (usually containing 
variables) and w is a weight function which assigns a positive integer 
to every ground term. 

The weight function induces a well founded order on ground terms by setting 
s < * if and only if w[s] < w[t]. Unlike the orderings used in traditional ordered 
rewriting systems, e.g., [Martin and Nipkow, 1990], the weight orderings used 
in grammar rewriting are not total on ground terms. In term rewriting one is 
interesting in simplifying one particular term. Grammar rewriting is also focused 
on a particular term. However, in grammar rewriting this term is represented by 
a nonterminal symbol of a grammar. 



Definition: We define the "one step" rewrite relation *-+ E on ground 
terms so that a[s] *-+ E <r[t] provided either s = t£EoTt = seE, 
every free variable of t appears in s, and a is a ground substitution. 

For example, if E is the set {g(x, f(x)) = c} then we have that g(h(a), f(h{a))) ^ E 
c but we do not have c \-+ E g(h(a), f(h(a))). 

Definition: A grammar term is a pair <X, G> where G is a finite 
congruence grammar and X is a nonterminal that appears in G. 

Definition: We say that a term s is a minimal representative of a 
grammar term <X, G>, with respect to a weight function w, if s is 
generated by X under G and no other term generated by X is smaller 
than s according to w. 

Definition: We define the one step rewrite relation *-+<&, „> on gram- 
mar terms so that 

<X, G> ^<E >vi> <X', G r (E q (G) U {u = v})> 

provided u is a subterm of some minimal representative of <X, G> 
under the weight function w, u t-+ E u, I G [u] ^ I G [v], and X' is the 
nonterminal of G r {E q {G)U{u = v}) that generates the terms generated 
by X under G. 

The one-step grammar rewriting operation defined above corresponds to se- 
lecting a term minimally generated by X, rewriting that term according to some 
equation in E, and then modifying the grammar so that the result of the rewrite 
is included in the language generated by X. A procedure for efficiently computing 
G r (E q (G) U {w - u}) from G and the equation w = u is given in section 7. This 
procedure does not construct an equation set — it performs a direct operation on 
the grammar. 

The definition of the rewrite relation on grammars requires that the new equa- 
tion being incorporated into the grammar is indeed new, i.e., it can not be an 
equation that is already implied by E q (G). This allows for the existence of normal 
forms as defined below. 

Definition: We say that a grammar term <X, G> is in normal form 
if there is no grammar term <X\ G'> such that <X G> «->^ x 
<X\ G'>. ^ ,w> 



As an example of rewriting, let C be the equation set consisting of the single 
commutative law x+y - y+x. Let 1 be the weight function that assigns every term 
the weight 1 (this weight function corresponds to the empty ordering on terms). 
Let G be the grammar consisting of the productions X 1 -► a x + X 2y X 2 -+ a 2 + X 3 , 
• • •, X n _i — ► a n _i + X n , X n — ► a n . In other words, G is the grammar generated 
by interning the term a a + (a 2 + (• • • + a n )) starting with the empty grammar. In 
the grammar rewriting system <C, 1> we have that the grammar term <X lf G> 
rewrites to a normal form <X U G'> where G' is the grammar consisting of the 
productions of the form 

Xi — * m + Xi + i 

Xi — ► Xi + i + a t - 

together with the production 

X n — > a n . 

A similar example can be given for the equation set AC consisting of the associative 
and commutative laws. 

Certain restrictions on the weight function w can be used to ensure that grammar 
rewriting always terminates. 

Definition: A grammar rewriting system <E, w> is called terminat- 
ing if there are no infinite rewrite chains of the form <X U Gi> t-^ <Ei w> 
<X 2i G 2 > »-►<£;,«£> <X 3 , G 3 > •-»<£,«,> • • •. 

Definition: A weight function w will be called a polynomial weight 
function if for each function symbol f of n arguments there exists a 
polynomial P s (x u • • • , x n ) in n variables with coefficients greater than 
or equal to 1, where each z t appears in at least one term, where there 
is a constant term of at least 1, and such that for any ground terms s u 
• • •, s n we have w[f(s u • • • , s n )] = P/fu^d, • • • , w[s n ]). 

Orderings based on polynomial weight functions are well known in the term rewrit- 
ing literature. 

Lemma: For a given finite set of function and constant symbols, and 
a given weight k, there are only finitely many terms that can be con- 
structed from those symbols that have weight less than or equal to 
k. 

Because grammar rewriting does not introduce new constant or function sym- 
bols, and because matching is restricted to minimal weight terms, we have the 
following well foundedness lemma for grammar rewriting. 



Lemma: If w is a polynomial weight function and E is any finite set 
of equations then <E, w> is terminating. 

We use i-*^^ to denote the reflexive transitive closure of the relation •-><£;,«>. 

Definition: Let I [s] be the grammar term that results from intern- 
ing s relative to the empty grammar. If I [s] K^,^ <X, G> and 
<X, G> is in normal form, then <X, G> is called a normal form of 
s. 

Definition: For any ground term s and set of equations E we define 
\s\ E to be the set of terms that can be proven to be equal to s using 
the equations in E. 

Definition: A grammar rewriting system <E, w> is called complete 
if for each ground term s, and any normal form <X, G> of s, every 
minimal member of l^l^ is generated by X under G. 

Definition: A grammar rewriting system is called canonical if it is 
terminating and complete. 

A canonical grammar rewriting system <E, w> provides a decision procedure for 
the equational theory E. 

Theorem: Let <E, w> be a canonical grammar rewriting system 
and let 3 and t be any two ground terms. Let <X, G> and <X', G'> 
be normal forms for s and t respectively under the rewriting system 
<E, w>. Let G" be the grammar that encodes all equivalences encoded 
in G or G\ i.e., G" is G r (E q (G) U E q (G% We have that E\=s = t if 
and only if I G n[s] = I G „[t]. 

This lemma follows from the invariant that for any ground term u if I [u] k->^ w 
<X, G> then X generates u under G and if <X, G> is a normal form of u then 
X generates all minimal elements of \u\ E . 



5 Examples 

We let w be the simple polynomial weight function such that for every term 
M, •••, s n ) we have that w[f(s u ..., s n )] = w[ 3l ] + . . . + w [s n ] + 1. Let 

10 



A and C be the two equation sets {x + (y + z) = (x + y) + 2} and {x + j/ = y + x} 
respectively. Let AC be A U C. One can easily verify that the systems A, C, and 
AC are all canonical under this ordering. 

Let w be any polynomial weight function satisfying w[s + 1] = 2w[s] + w[t] + 1. 
Under this weight function we have that w[s + 1] < w[t + s] provided w[s] < w[t] 
and we have w[s + (t + u)] < w[(s + 1) + tx] for any terms s, *, and w. The equation 
set AC U {x + (y + z) = y + (a: + z)} forms a canonical system where every term 
normalizes to a grammar term whose minimal representatives are the weight sorted 
permutations of the addends under a standard parenthesization. If the addends 
are linearly ordered by weight then there is only one minimal representative. 

We now consider Abelian groups. Let w be any order satisfying to[s+*] = ti;[s]+ 
w[t] + 1 and w[-s] = 2w[s]. Under this ordering we have that w[(-s) + (-*)] < 
w[-(s+t)]. The equation set AC U{-( x +y) = (_ x )+(-y), x +(-x) - 0, x+0 = 
a:} is a canonical system under this ordering. Every term normalizes to a grammar 
whose minimal representatives form an AC equivalence class. Refinements of the 
ordering can give canonical systems which generate smaller grammars. 

There seems to be little difficulty in handling the well known theories of ACI, 
groups of exponent 2, and Boolean rings where each theory can be handled with 
different weight functions that correspond to different minimal term sets. 

Let E be AC U {f(x + x) = 1}. This is given in [Martin and Nipkow, 1990] 
as an example of a theory that can not be handled by ordered term rewriting 
without special unification procedures. However, this equation set is canonical 
under grammar rewriting using the simple ordering given in the first example of 
this section. 



6 Locally Context Free Theories 

Let B (for Band) be the equation set {x*(y*z) = (x*y)*z, x*x = x}. It has been 
shown that no finite term rewriting or word rewriting system can be canonical for B 
[Baader, 1990]. However, it is possible to show that B itself is a canonical grammar 
rewriting system relative to the empty term ordering (and under fair rewriting 
as defined below). The proof of this result can be generalized and provides an 
interesting class of canonical grammar rewriting systems. Throughout this section 
we use the weight function 1 that assigns the weight 1 to all terms. This induces the 
empty ordering on terms. We start with some simple definitions and observations. 
Throughout this section we let E a fixed but arbitrary finite set of equations. 

Definition: E is called fully bidirectional if for each equation s = tin 
E the terms s and t contain the same set of variables. 

11 



Recall that a grammar rewriting system <E, w> is complete if any normal form 
of a term s represents all minimal elements of |s|#. 

Lemma: If E is fully bidirectional then <E, 1> is complete. 

If E is fully bidirectional and all terms are the same weight then grammar 
rewriting starting with a term s corresponds to unrestricted enumeration of the 
equivalence class of s. If this process leads to a normal form then that normal form 
must be a grammar that generates the entire equivalence class of s. Of course, 
for many equation sets E the system <E, 1> fails to produce any normal forms 
and the above lemma is vacuously true. However, many equation sets do generate 
normal forms under the empty term ordering. 

Definition: An equation set E is called finitary if for every ground 
term s the set \s\e is finite. 

Lemma: If E is fully bidirectional and finitary then <£, 1> is canon- 
ical. 

This follows directly form the fact that if E is finitary then the enumeration 
of the equivalence class of a term must terminate. The above lemma immedi- 
ately implies that <A, 1>, <C, 1> and <AC, 1> are all canonical. The next few 
definitions and lemmas give a more interesting class of canonical systems. 

Definition: An equation set E is called locally context free if, for every 
ground term s, the set \s\ E can be generated by a finite congruence 
grammar. 

Note that if E is locally context free then each equivalence class under E is 
a context free language in the traditional sense. If E is locally context free then 
normal forms exist. This does not imply, however, that the grammar rewriting 
system is terminating. Let E contain the three equations f(x) = x, g(x) = x, 
and f(x) = f(g(x)). E is locally context free. However, by repeatedly selecting 
only the last equation it is possible for a grammar rewrite system (under the 
empty term ordering) to run forever. We can rule out this kind of nontermination 
by considering only fair rewriting schemes. Intuitively, a rewrite system is fair 
provided that no equation is ignored indefinitely. 

Definition: An infinite chain <X U Gi> *-><*;,„> <X 2 , G 2 > «-><e,«> 
<X 3 , G 3 > ^ < je i w> ... is said to be fair if for any subterm u of a 
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minimal representative of a grammar term <X>, (?,>, if u \-* E v then 
there exists some grammar term <Xj, Gj> with j > i such that either 
u is not a subterm of minimal representative of <Xj, Gi> or / G .[«] = 
EaM- 

Definition: A grammar rewriting system <E, w> is said to termi- 
nate under fair rewriting if there is no fair infinite rewriting chain 
for <E, w>. The system <E, w> will be called canonical under fair 
rewriting if it is complete and terminates under fair rewriting. 

Lemma: If E is fully bidirectional and locally context free then <£, 1> 
is canonical under fair rewriting. 

If E is fully bidirectional and all terms have the same weight then grammar 
rewriting of a term s corresponds to the enumeration of the entire equivalence class 
of s. If this enumeration is done in a fair manner then we eventually generate every 
equation of the form u = v that is provable from E where u and v are subterms 
of terms equivalent to s. If 3 is generated by a finite grammar then this grammar 
can be written as G r (H) where H is some finite set of equations of this form. So 
the rewrite process must eventually construct this final grammar and terminate. 

To show that the equation set B is canonical it now suffices to show that it is 
locally context free. The following proof is due largely to Robert Givan and Carl 
Witty of the MIT AI Laboratory. 

Definition: An equation set E is said to be locally finite if for any finite 
set S of constant symbols the set of all terms that can be constructed 
from the constants in S and the functions in E fall into a finite number 
of equivalence classes under E. 

For example, the equation set AC U {x+x = x} has the property that for any fixed 
set of n constants, every sum that can be constructed from those n constants fall 
into 2 n - 1 equivalence classes — one equivalence class for each nonempty subset 
of the constants. 

It is interesting to note that E is locally finite if and only if for any finite set S 
the free ^-algebra generated by S is finite. 

Lemma: If E is locally finite and fully bidirectional then E is locally 
context free. 
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Consider a term s and consider a term t in \s\ E . If E is fully bidirectional 
then every constant that appears in t must appear in either s or E. Since E is 
locally finite there are only a finite number of equivalence classes of such terms. By 
creating a nonterminal symbol for each equivalence class one can show construct a 
grammar for generating the equivalence class of s. Given that a grammar exists, it 
can be computed using the fact that <E, 1 » is canonical whenever E is locally 
context free. 

The converse of the above lemma does not hold. There are locally context free 
equation sets which are not locally finite. The empty equation set is an example. 

One can use the infinite canonical word rewrite system for B given in [Siekmann 
and Szabo, 1982] (also described in [Baader, 1990]) to prove that B is locally finite 
— for any set of n constants there exists a k such that all words built the given 
constants can be simplified to a word no longer than k. The above lemma now 
implies that <£?, 1> is canonical under fair rewriting. 

7 Incorporating Equations into Grammars 

This section gives an algorithm for computing G r (E q (G) U {s = t}) from the 
grammar G and the equation s = t. The grammar G r (E q (G) U {s = t}) is directly 
computed from G by incrementally adding and removing productions. The algo- 
rithm given here is quite similar to the congruence closure procedure described in 
[Nelson and Oppen, 1980]. The algorithm has been reformulated here to operate 
on grammars and optimized to run in order nlogra time (under the assumption 
that hash table operations take unit time). 2 

This procedure uses the internal constants of the form c z that are associated 
with the nonterminals of the grammar. The procedure maintains an equivalence 
relation represented by three sets of equations. First, the procedure maintains 
a congruence grammar. Second, the procedure maintains a queue of equations 
of the form c z = cy. Third, an additional set of equations between constants 
of the form c z is maintained in a union-find structure on these constants. The 
equivalence relation determined by these three sets of equations is maintained as 
a fixed invariant of the procedure. 

As mentioned above, equations between constant symbols can be handled by 
the well known union-find procedure. 3 Consider a set of equations a r = ^ a 2 = 
&2, • • • , a n = 6 n . This set of equations can be represented in a union-find structure 

2 A somewhat different nlogn algorithm for congruence closure is described in [Downey et a/., 
i y ouj . 

3 A description of the union-find protocol and efficient union-find algorithms can be found in 
most modern algorithms texts. 
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by executing union^, i^), union(a 2 , 62), •• •, union(a n , 6 n ). To see if an equation 
c = d follows from the given equations one computes f ind(c) and f ind(rf). The 
equation c = d is provable if and only if these two find operations return the same 
value. 

In the following procedure we assume that the find operation is such that 
f ind(cx) is a canonical member of the equivalence class of c x relative to the 
equivalence relation encoded in the union-find structure. In other words, f ind(cy) 
is a constant cy such that f ind(cy) equals cy. We also assume that when two 
equivalence classes are merged with a union operation the canonical representative 
of the resulting equivalence class is selected to be one of the two previous canonical 
representatives. 

Definition: A constant c x is said to be dead if f ind(c^) is some 
constant other than c x . A constant that is not dead is said to be alive, 
i.e., a constant cy is alive if f ind(cy) equals cy. 

Procedure for computing G r (E q (G) U {s = t}): 

1. Let Z and W be the nonterminals I G [s] and 7gM- 

2. Initialize S to be a queue containing the single equation c z = c w . 

3. While the queue S is not empty do the following. 

(a) Remove an equation c z = c w from the queue S. 

(b) Let c x be f ind(c^) and let cy be f ind(c^). 

(c) K c x is the same symbol as cy then do nothing, otherwise: 

(d) Call union(cy, cy). 

(e) Swap the roles of X and Y if necessary so that c x is dead and 
cy is still alive. 

(f) Let V be the set of all productions involving X. 

(g) Remove all the productions in V from the grammar. 

(h) For each production Z -► f(W u • . . W n ) in V do the follow- 
ing. 

i. Let c z , be f ind(c z ) and for each W { let c w , be f ind(c Wi ) 
ii. If there is no production whose right hand side is f(W{, •••, W) 

then add the production Z' -> f(W[, • • • , W„). 
iii. If there is already a production U -> f(W{, •••, W) 
where U is different from Z' then add the equation cu = 
cz> to the queue S. 
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This procedure maintains the invariant that no two productions in the grammar 
have the same right hand side, i.e., the grammar is always a congruence grammar. 
Furthermore, the equivalence relation encoded in the equations in the grammar, 
the queue, and the union-find procedure is maintained as a fixed invariant. To 
check this one must check that every added equation is derivable from previous 
equations and that every equation removed in step g is derivable from previous 
equations plus those equations added in step h. The procedure also maintains the 
invariant that every nonterminal in every production is "alive", i.e., if X appears 
in a production of the grammar then f ind(cx) equals c x . Every nontrivial ex- 
ecution of the main loop reduces the number of living nonterminal symbols, so 
the procedure must terminate. Furthermore, when the procedure terminates the 
equational theory enoded in the union-find structure can be dropped without al- 
tering the induced equivalence relation on external terms. To prove this consider 
an extended grammar that includes all productions of the form X -* cy where 
X is a living nonterminal and f ind(cy) is c x . The equation set of the extended 
grammar encodes the equivalence relation of the original grammar plus the equiv- 
alences in the union find structure. However, this extended grammar encodes the 
same intern function on external terms as the unextended grammar. 

If we assume a bound on the number of arguments taken by function symbols, 
e.g., no function takes more than three arguments, and assume that hash table 
operations can be performed in unit time, then under an appropriate implemen- 
tation of union-find it can be shown that the above procedure terminates in order 
nlogn time where n is the number of productions in the original grammar. In 
practice the incorporation of a single new equation into a large grammar requires 
the manipulation of only a small subset of the grammar. 

8 A Grammar Rewriting Algorithm 

In this section we consider a fixed but arbitrary grammar rewriting system <E, w>, 
where w is a polynomial weight function, and give a procedure for computing all 
the ways in which a given grammar term can be rewritten under <E, w>. The 
procedure is incremental so that if <X, G> ^<e iW> <X\ G'> then the set of 
possible ways of rewriting <X\ G'> can be computed incrementally from the set 
of possible ways of rewriting <X, G>. Incremental procedures can be defined by 
"inference rules" that are run in an incremental forward chaining manner. We first 
give rules for deriving the weight of nonterminal symbols as defined below. 

Definition: For each nonterminal Y of a congruence grammar the 
weight of F, denoted w\Y], is minimum weight of all terms generated 
by Y. 
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The following inference rule can be used to propagate bounds on weights. 
w[Xi] < W{ 

w[X n ] < w n 

Z-+f(X u ...,x n ) 



w[Z]<P f {w u ...,w n ) 



For constant symbols (functions of no arguments) the above inference rule can 
be used to generate a weight bound directly from a production of the form Z -► c. 
Weight bounds generated by constants can then be propagated to other symbols. 
One can show that, for any given nonterminal Y, the tightest bound that can be 
derived for w[Y] using the above rule is in fact the weight of Y as defined above. 
In practice, simply running this rule over the entire grammar until the tightest 
bounds are derived seems to be an acceptable incremental algorithm for computing 
the weight of nonterminals. However, the theoretical worst case behavior of this 
this algorithm is quite bad. An n log n algorithm can be derived by placing derived 
bounds on a priority queue and processing the tightest bounds first. 

Procedure to compute the weight of all nonterminals: 

1. Initialize S to be the priority queue containing all pairs of the 
form <y, w> where the grammar contains the production Y -+ a 
where a is a constant and it; is the weight of a. 

2. Until S is empty, or until every nonterminal has been assigned a 
weight, do the following. 

(a) Remove a pair <Y, w> from S such that w is the minimum 
weight of all pairs on S. 

(b) If Y has already been assigned a weight do nothing. Other- 
wise: 

(c) Assign Y the weight w. 

(d) For each production Z -► f(W u • • • , W n ) such that some W, 
is Y and such that each Wi has been assigned a weight to,-, 
add the pair <Z, P f (w u • • • , w n )> to the queue S. 

Each pair added to the queue has a larger weight than the last pair removed 
from the queue. This implies that the pairs removed from the queue have mono- 
tonically increasing weight. This, plus the assumed properties of the polynomial 
weights, implies that for each nonterminal symbol Y the weight assigned to Y is 
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the minimum over all the productions from Y of the weight computed from that 
production. One can check that by selecting productions that minimize weight 
that if Y has been assigned weight w then Y generates a term of weight w. Fur- 
thermore, one can prove by induction on the weight of a term s that the weight of 
s is at least as large as the weight assigned to I[s], the symbol that generates s. 
Since the procedure only assigns a single weight to each nonterminal symbol the 
number of pairs placed on the priority queue is linear in the size of the grammar. 
Order n insertions and deletions from a priority queue can be done in n log n time. 
Assuming a bound on the number of arguments taken by any function symbol, the 
other operations in this procedure can be performed in order n time so the total 
running time is order n log n. 

The next step is to identify those nonterminals in the grammar that generate 
subterms of minimal representatives of <X, G>. 

Definition: A production Z -► f(X u • • • , X n ) is minimal if w[Z] = 
P,WXi], ~ ' , »[Xn]). 

Lemma: If s is a minimal representative of the grammar term < X, G> 
then s is generated by X in that subset of G which consists of just the 
minimal productions of G. 

Definition: We say that a nonterminal symbol Y is a minimal subterm 
nonterminal of a grammar term <X, G> if either Y is X (X is a 
minimal subterm nonterminal) or G contains a minimal production 
W —> f(Z u • • • , Z n ) where W is a minimal subterm nonterminal and 
some Z{ is Y. 

As an example consider the grammar term <X, G> where G consists of the 
five productions 

X-+f(Y), Y->g(Z), Z-+a 
Z->h(W), W->f(Z). 
The production Z -► h(W) is not minimal — it does not provide a smallest 
term generated by Z. However, all other productions are minimal, including the 
production W -+ f(Z) which provides a smallest term generated by W. However, 
the nonterminal W is not a minimal subterm nonterminal — it does not generate 
a subterm of a minimal representative of <X, G>. 

The procedure for constructing matches uses a simple extension of the grammar. 

Definition: An extended congruence grammar is a congruence gram- 
mar such that, for each nonterminal X appearing in a production of 
the grammar, the grammar also contains the production X -► c x . 
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We define a match to be a triple, matchjY; s, <r], where Y is a nonterminal, s 
is a term, and a is a substitution such that I[<r[s\] is Y. Matches can be computed 
by starting with "basic matches" and creating new matches according to inference 
rules that generate new matches from old matches. For each minimal subterm 
nonterminal Y, and for each variable x occurring in E, we create the basic match 
match[F, x, {x i-> cy}]. To minimize the number of basic matches it is important 
to rename the variables in equations in E to minimize the total number of vari- 
ables in E. Typically there will be no more than 3 or 4 variables in E. We also 
create basic matches for constant symbols that appear in E. For each constant a 
appearing in E we create the basic match match[Jo[a], a, 0] where is the empty 
substitution. More complicated matches can be constructed using the following 
rule. 

matchpi, 3i t Ti] 

match[y n , 3 n , r n ] 

z-*fiy u ...,y n ) 

match[Z, f(s u ••, * n ), a] 

The substitutions are represented by finite lists of variable- value pairs. In all 
the substitutions constructed by the matching procedure the values assigned to 
variables are always internal constants. The above rule only applies when the 
substitutions r, agree on all shared variables — if Ti and r, both contain a pair 
assigning a value to the variable x then they must both assign x to the same 
internal constant. The substitution a is simply the union of the r t 's. The above 
rule is also restricted to the case where Z is a minimal subterm nonterminal, the 
production Z -+ f(Y u • . • , Y n ) is minimal, and f(s u • • . , s n ) is a term occurring 
in E. 

The restrictions on the above inference rule are "nonmonotonic" . The addi- 
tion of a new production can cause other productions to go from being minimal 
to being nonminimal. When matches are computed incrementally as new produc- 
tions are added it is possible that previously computed matches become "obsolete" 
in the sense that they were computed from productions that have now become 
nonminimal. 4 The failure to remove matches that are obsolete due to productions 
that are no longer minimal will cause minor overgeneration of matches. The time 
required to detect and r emove these obsolete matches is probably greater than the 

'Matches can also become obsolete if nonterminals or constant symbols involved in the match 
become dead" due to generated equations between nonterminals. Matches that are obsolete due 
to references to dead nonterminals or dead constants are easily detected and eliminated 
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time taken to perform any extra rewrites they cause. Rewrites generated by these 
obsolete matches are semantically sound and do no harm. 

Finally, we construct "equate relations" of the form cz *-+e v where cz is 
an internal constant and v is a ground term involving internal constants. More 
specifically, if we derive match[Z, s, a\ and E contains either s = t or t = s 
where every free variable in t occurs in s, then we can derive c z *-*e <*■[*]. The set 
of derived equate relations of the form c z *-+ E v provide all the possible ways of 
rewriting the grammar. For each equate relation cz *->e u, the grammar can be 
rewritten by using the procedure of section 7 to equate c z and v. 



9 Summary 

Grammar rewriting is motivated by the desire to increase the success rate of at- 
tempts to prove equations in nonconfluent rewrite systems. Intuitively, the rewrite 
process generates a set of terms rather than an individual term, and by generating 
two sets of terms, rather than two classical normal forms, we increase the likeli- 
hood of proving the desired equation. Although grammar rewriting is primarily 
motivated by the need to handle nonconfluent theories, it also provides a new 
kind of canonical rewriting system. Under grammar rewriting there exist finite 
canonical systems for equational theories, such as idempotent semigroups, that 
do not have finite canonical systems under traditional notions of rewriting. The 
pragmatic value of grammar rewriting in general purpose provers, such as [Boyer 
and Moore, 1979], has not yet been adequately investigated. 
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